9 research outputs found

STREAM-EVOLVING BOT DETECTION FRAMEWORK USING GRAPH-BASED AND FEATURE-BASED APPROACHES FOR IDENTIFYING SOCIAL BOTS ON TWITTER

Author: Alothali Eiman
Publication venue: Scholarworks@UAEU
Publication date: 01/06/2023

This dissertation focuses on the problem of evolving social bots in online social networks, particularly Twitter. Such accounts spread misinformation and inflate social network content to mislead the masses. The main objective of this dissertation is to propose a stream-based evolving bot detection framework (SEBD), which was constructed using both graph- and feature-based models. It was built using Python, a real-time streaming engine (Apache Kafka version 3.2), and our pretrained model, the bot multi-view graph attention network (Bot-MGAT). The feature-based model was used to identify predictive features for bot detection and evaluate the SEBD predictions. The graph-based model was used to apply multi-view graph attention networks (GATs) with fellowship links to build our framework for predicting account labels from streams. A probably approximately correct learning framework was applied to confirm the accuracy and confidence levels of SEBD. The results showed that SEBD can effectively identify bots from streams and that profile features are sufficient for detecting social bots. The pretrained Bot-MGAT model uses fellowship links to reveal hidden information that can aid in identifying bot accounts. The significant contributions of this study are the development of a stream-based bot detection framework for detecting social bots based on a given hashtag and the proposal of a hybrid approach for feature selection to identify predictive features for identifying bot accounts. Our findings indicate that Twitter has a higher percentage of active bots than humans in hashtags. The results indicated that stream-based detection is more effective than offline detection, achieving an accuracy score of 96.9%. Finally, semi-supervised learning (SSL) can address the shortage of labeled data in bot detection tasks.

United Arab Emirates University: Scholarworks@UAEU / جامعة الامارات

Using Self-labeling and Co-Training to Enhance Bots Labeling in Twitter

Authors: Alashwal Hany, Alothali Eiman, Hayawi Kadhim
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 08/12/2022

The rapid evolution of social bots has required efficient solutions to detect them in real time. Obtaining labeled stream datasets that contain a variety of bots is essential for this classification task, yet it remains one of the challenging issues for this problem. Accordingly, finding appropriate techniques to label unlabeled data is vital to enhance bot detection. In this paper, we investigate two labeling techniques for semi-supervised learning, self-training and co-training, and evaluate their performance for bot detection. Our results show that self-training with maximum confidence performed best, achieving an F1 score of 0.856 and an AUC of 0.84. The random forest classifier performed better than the other classifiers in both techniques. In co-training, the single-view approach with a random forest classifier using fewer features performed slightly better than the single view with more features. Using a multi-view of features in co-training generally achieved similar results for different splits.

ZU Scholars (Zayed University)
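
The self-training abstract above does not spell out how "maximum confidence" selects samples. One common reading, assumed here, is that each round pseudo-labels only the unlabeled accounts whose predicted class probability is highest. The sketch below shows that selection step in isolation; the function name and the `[p_human, p_bot]` layout are hypothetical and not taken from the paper.

```python
def pseudo_label_most_confident(probabilities, k):
    """Return (index, label, confidence) for the k most confident predictions.

    probabilities: one list of class probabilities per unlabeled sample,
    e.g. [p_human, p_bot] (this layout is an assumption for illustration).
    """
    scored = []
    for index, probs in enumerate(probabilities):
        confidence = max(probs)
        label = probs.index(confidence)  # class with the highest probability
        scored.append((index, label, confidence))
    # Most confident first; Python's sort is stable, so ties keep input order.
    scored.sort(key=lambda item: item[2], reverse=True)
    return scored[:k]


unlabeled_probs = [[0.6, 0.4], [0.1, 0.9], [0.55, 0.45], [0.95, 0.05]]
selected = pseudo_label_most_confident(unlabeled_probs, 2)
```

In a full self-training loop, the selected samples would be added to the labeled set with their pseudo-labels and the classifier retrained before the next round.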

Hybrid feature selection approach to identify optimal features of profile metadata to detect social bots in Twitter

Authors: Alashwal Hany, Alothali Eiman, Hayawi Kadhim
Publication venue: ZU Scholars
Publication date: 01/12/2021

The last few years have revealed that social bots in social networks have become more sophisticated in design as they adapt their features to avoid detection systems. The ability of bots to mimic human users deceptively is due to advances in artificial intelligence and chatbots, which let these bots learn and adjust very quickly. Therefore, finding the optimal features needed to detect them is an area for further investigation. In this paper, we propose a hybrid feature selection (FS) method to evaluate profile metadata features to find these optimal features, which are evaluated using random forest, naïve Bayes, support vector machines, and neural networks. We found that the cross-validation attribute evaluation performance was the best when compared to other FS methods. Our results show that the random forest classifier with six optimal features achieved the best score of 94.3% for the area under the curve. The results maintained overall 89% accuracy, 83.8% precision, and 83.3% recall for the bot class. We found that using four features (favorites_count, verified, statuses_count, and average_tweets_per_day) achieves good performance metrics for bot detection (84.1% precision, 81.2% recall).

ZU Scholars (Zayed University)
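
As a rough illustration of the four profile features named in the hybrid feature selection abstract above, the sketch below builds them from a profile record. The abstract does not define `average_tweets_per_day`; dividing `statuses_count` by the account age in days is an assumption, and the input dictionary keys are hypothetical.

```python
from datetime import datetime, timezone


def profile_features(profile, now):
    """Build the four-feature record from a profile dict (keys are assumed)."""
    # Account age in whole days; clamp to 1 so new accounts do not divide by zero.
    age_days = max((now - profile["created_at"]).days, 1)
    return {
        "favorites_count": profile["favorites_count"],
        "verified": int(profile["verified"]),  # encode the flag as 0 or 1
        "statuses_count": profile["statuses_count"],
        # Assumed definition: lifetime tweets divided by account age in days.
        "average_tweets_per_day": profile["statuses_count"] / age_days,
    }


example_profile = {
    "favorites_count": 12,
    "verified": False,
    "statuses_count": 50,
    "created_at": datetime(2020, 1, 1, tzinfo=timezone.utc),
}
features = profile_features(example_profile, datetime(2020, 1, 11, tzinfo=timezone.utc))
```

A record like this could then be passed to any classifier; the abstract reports results with random forest, among others.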

Data stream mining techniques: a review

Authors: Alashwal Hany, Alothali Eiman, Harous Saad
Publication venue: 'Universitas Ahmad Dahlan'
Publication date: 01/04/2019

A vast and effectively unbounded volume of data is generated from the Internet and other information sources. Analyzing this massive data in real time and extracting valuable knowledge using different mining application platforms has been an area of interest for both research and industry. However, data stream mining poses challenges that set it apart from traditional data mining. Recently, many studies have addressed concerns about massive data mining problems and proposed several techniques that produce impressive results. In this paper, we review real-time clustering and classification mining techniques for data streams. We analyze the characteristics of data stream mining and discuss the challenges and research issues of data stream mining. Finally, we present some of the platforms for data stream mining.

TELKOMNIKA (Telecommunication Computing Electronics and Control)

UAD Journal Management System

Characteristics of Similar-Context Trending Hashtags in Twitter: A Case Study

Authors: Alashwal Hany, Alothali Eiman, Hayawi Kadhim
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2020

Twitter is a popular social networking platform that is widely used for discussing and spreading information on global events. Twitter trending hashtags have been one of the topics for researchers to study and analyze. Understanding posting behavior patterns as information flows increase during rapid events can help in predicting future events or detecting manipulation. In this paper, we investigate similar-context trending hashtags to characterize the general behavior of specific trends and generic trends within the same context. We present an analysis that studies and compares such trends based on spatial, temporal, content, and user activity. We found that the characteristics of similar-context trends can be used to predict future generic trends with analogous spatiotemporal, content, and user features. Our results show that more than 70% of the users participating in a location-based hashtag belong to the location of the hashtag. Generic trends aim to have more influence on user participation than specific trends with geographical context. The retweet ratio in specific trends is higher than in generic trends, at more than 79%.

ZU Scholars (Zayed University)

Real Time Detection of Social Bots on Twitter Using Machine Learning and Apache Kafka

Authors: Alashwal Hany, Alothali Eiman, Hayawi Kadhim, Salih Motamen
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 14/10/2021

Social media networks, like Facebook and Twitter, are increasingly becoming an important part of most people's lives. Twitter provides a useful platform for sharing content, ideas, and opinions, and for promoting products and election campaigns. Due to its increased popularity, it became vulnerable to malicious attacks caused by social bots. Social bots are automated accounts created for different purposes. They are involved in spreading rumors and false information, cyberbullying, spamming, and manipulating the ecosystem of the social network. Most social bot detection methods rely on offline data for both training and testing. In this paper, we use Apache Kafka, a big data analytics tool, to stream data from the Twitter API in real time. We use profile information (metadata) as features. A machine learning technique is applied to predict the type of the incoming data (human or bot). In addition, the paper presents technical details of how to configure these different tools.

ZU Scholars (Zayed University)

Identification and analysis of free games' permissions in Google Play

Authors: Alfandi Omar, Alothali Eiman, Belqasmi Fatna, Iqbal Farkhund
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2015

Smartphones are becoming more prevalent than ever, and so is the use of game applications for mobile phones. The Android platform offers an attractive environment for game developers, as it is open source and is supported by many of the available smartphones. This paper surveys, analyses, and identifies the most requested permissions when installing a new game application on Android. We use this analysis to draw some conclusions and recommendations about how to recognize suspicious games or malware. The study focuses on free Android game applications and covers 530 games from Google Play. The study shows that 'full Internet access' is the most requested permission and that 60% of the requested permissions are of high risk. These results clearly call for more vigilance when installing new games/applications and mandate new solutions to secure Android applications and help end-users choose their games/applications safely.

ZU Scholars (Zayed University)

Bot-Mgat: A Transfer Learning Model Based On A Multi-View Graph Attention Network To Detect Social Bots

Authors: Alashwal Hany, Alothali Eiman, Hayawi Kadhim, Salih Motamen
Publication venue: ZU Scholars
Publication date: 01/08/2022

Twitter, as a popular social network, has been targeted by different bot attacks. Detecting social bots is a challenging task, due to their evolving capacity to avoid detection. Extensive research efforts have proposed different techniques and approaches to solving this problem. Due to the scarcity of recently updated labeled data, the performance of detection systems degrades when exposed to a new dataset. Therefore, semi-supervised learning (SSL) techniques can improve performance, using both labeled and unlabeled examples. In this paper, we propose a framework based on the multi-view graph attention mechanism using a transfer learning (TL) approach, to predict social bots. We called the framework 'Bot-MGAT', which stands for bot multi-view graph attention network. The framework used both labeled and unlabeled data. We used profile features to reduce the overheads of the feature engineering. We executed our experiments on a recent benchmark dataset that included representative samples of social bots with graph structural information and profile features only. We applied cross-validation to avoid uncertainty in the model's performance. Bot-MGAT was evaluated using graph SSL techniques: single graph attention networks (GAT), graph convolutional networks (GCN), and relational graph convolutional networks (RGCN). We compared Bot-MGAT to related work in the field of bot detection. Bot-MGAT with TL performed best, with an accuracy score of 97.8%, an F1 score of 0.9842, and an MCC score of 0.9481.

ZU Scholars (Zayed University)

Directory of Open Access Journals

SEBD: A Stream Evolving Bot Detection Framework with Application of PAC Learning Approach to Maintain Accuracy and Confidence Levels

Authors: Eiman Alothali, Hany Alashwal, Kadhim Hayawi
Publication venue: 'MDPI AG'
Publication date: 01/03/2023

A simple supervised learning model can predict a class from trained data based on the previous learning process. Trust in such a model can be gained through evaluation measures that ensure fewer misclassification errors in prediction results for different classes. This can be applied to supervised learning using a well-trained dataset that covers different data points and has no imbalance issues. This task is challenging when it integrates a semi-supervised learning approach with a dynamic data stream, such as social network data. In this paper, we propose a stream-based evolving bot detection (SEBD) framework for Twitter that uses a deep graph neural network. Our SEBD framework was designed based on multi-view graph attention networks using fellowship links and profile features. It integrates Apache Kafka to enable the Twitter API stream and predict the account type after processing. We used a probably approximately correct (PAC) learning framework to evaluate SEBD’s results. Our objective was to maintain the accuracy and confidence levels of our framework to enable successful learning with low misclassification errors. We assessed our framework results via cross-domain evaluation using test holdout, machine learning classifiers, benchmark data, and a baseline tool. The overall results show that SEBD is able to successfully identify bot accounts in a stream-based manner. Using holdout and cross-validation with a random forest classifier, SEBD achieved an accuracy score of 0.97 and an AUC score of 0.98. Our results indicate that bot accounts participate highly in hashtags on Twitter.

ZU Scholars (Zayed University)

Directory of Open Access Journals
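
The SEBD abstract above speaks of maintaining "accuracy and confidence levels" under a PAC learning framework but does not say which bound was applied. As a hedged illustration only, the sketch below uses the standard Hoeffding inequality, which gives the number of independent labeled samples needed so that a measured error rate lies within `epsilon` of the true rate with probability at least `1 - delta`: n >= ln(2/delta) / (2 * epsilon^2). The function name and parameter names are hypothetical.

```python
import math


def hoeffding_sample_size(epsilon, delta):
    """Samples needed so the observed error rate is within epsilon of the
    true rate with probability at least 1 - delta (Hoeffding's inequality)."""
    if not (0 < epsilon < 1 and 0 < delta < 1):
        raise ValueError("epsilon and delta must lie strictly between 0 and 1")
    return math.ceil(math.log(2 / delta) / (2 * epsilon ** 2))


# Within 5 percentage points of the true error rate, with 95% confidence.
needed = hoeffding_sample_size(0.05, 0.05)
```

Tightening either parameter raises the requirement, and halving `epsilon` roughly quadruples the sample size because of the squared term.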